Skip to content

feat: remove mode argument from Semble#119

Merged
stephantul merged 4 commits into
mainfrom
remove-mode
May 20, 2026
Merged

feat: remove mode argument from Semble#119
stephantul merged 4 commits into
mainfrom
remove-mode

Conversation

@stephantul
Copy link
Copy Markdown
Contributor

This PR removes the mode argument from the search. The mode just led to weaker results because it disabled the ranking pipeline. Because we also expose alpha, users can safely choose full bm25 or full semantic by setting alpha to 0.0 or 1.0, and still get the benefits of the ranking pipeline.

For ablations, I opted to remove the model results as well, I think the results are still pretty clear and actionable, although I can see the value in patching it for evaluation purposes (e.g., maybe disabling the reranking with a private boolean?)

@stephantul stephantul changed the title Remove mode feat: remove mode argument from Semble May 19, 2026
@stephantul stephantul requested a review from Pringled May 19, 2026 18:44
@codecov
Copy link
Copy Markdown

codecov Bot commented May 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Files with missing lines Coverage Δ
src/semble/__init__.py 100.00% <100.00%> (ø)
src/semble/cli.py 100.00% <100.00%> (ø)
src/semble/index/index.py 100.00% <100.00%> (ø)
src/semble/mcp.py 100.00% <100.00%> (ø)
src/semble/search.py 100.00% <100.00%> (ø)
src/semble/types.py 100.00% <ø> (ø)

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Member

@Pringled Pringled left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Just some small questions.

Comment thread src/semble/search.py Outdated
return [SearchResult(chunk=chunks[i], score=float(scores[i])) for i in indices if scores[i] > 0]


def search_hybrid(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO we could even just rename this to search to make it clearer that we just support one search (since the way I read the function name as is implies it's just hybrid search but it's hybrid search + all our tricks, search reads more natureally to me).

# alpha=0.0 → hybrid pipeline, BM25-only input
# alpha=1.0 → hybrid pipeline, semantic-only input
_MODE_PARAMS: dict[str, tuple[SearchMode, float | None]] = {
"bm25": (SearchMode.BM25, None),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need to update the actual results I guess if we do this? The readme etc? I'm a bit torn on removing these, since specially bm25 is a very strong and standard baseline to report. But we can just call the private functions here to to keep supporting these ablations I think, wdyt?

@stephantul stephantul requested a review from Pringled May 20, 2026 10:05
@stephantul stephantul merged commit ef7d3fe into main May 20, 2026
16 checks passed
@stephantul stephantul deleted the remove-mode branch May 20, 2026 10:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants